Use watsonx and LangChain to make a series of calls to a language model¶

Disclaimers¶

Use only the projects and spaces that are available in the watsonx context.

Notebook content¶

This notebook contains the steps and code to demonstrate a sequential chain using the LangChain integration with watsonx models.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

Learning goal¶

The goal of this notebook is to demonstrate how to chain the google/flan-ul2 and google/flan-t5-xxl models so that the first generates a random question on a given topic and the second answers that question. It also introduces the LangChain framework through a simple chain (LLMChain) and an extended chain (SequentialChain), both used with WatsonxLLM.

Contents¶

This notebook contains the following parts:

Setup
Foundation Models on watsonx
LangChain integration
Sequential Chain experiment
Python function
Custom inference endpoint
Scoring
Summary

Set up the environment¶

Before you use the sample code in this notebook, you must perform the following setup tasks:

Create a Watson Machine Learning (WML) Service instance (a free plan is offered and information about how to create the instance can be found here).

Install and import the dependencies¶

In [ ]:

!pip install -U "langchain>=0.3,<0.4" | tail -n 1
!pip install -U "langchain_ibm>=0.3,<0.4" | tail -n 1

Defining the WML credentials¶

This cell defines the WML credentials required to work with watsonx Foundation Model inferencing.

Action: Provide the IBM Cloud user API key. For details, see documentation.

In [2]:

import getpass
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Please enter your WML api key (hit enter): "),
)

Defining the project id¶

The Foundation Model requires a project id that provides the context for the call. The id is read from the project in which this notebook runs. If it is not available there, provide the project id when prompted.

In [3]:

import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

Create an instance of APIClient with authentication details.

In [5]:

from ibm_watsonx_ai import APIClient

api_client = APIClient(credentials=credentials, project_id=project_id)

Foundation Models on `watsonx.ai`¶

List available models¶

All available models are listed under the TextModels class.

In [8]:

api_client.foundation_models.TextModels.show()

{'MT0_XXL': 'bigscience/mt0-xxl', 'CODELLAMA_34B_INSTRUCT_HF': 'codellama/codellama-34b-instruct-hf', 'JAIS_13B_CHAT': 'core42/jais-13b-chat', 'ELYZA_JAPANESE_LLAMA_2_7B_INSTRUCT': 'elyza/elyza-japanese-llama-2-7b-instruct', 'FLAN_T5_XL': 'google/flan-t5-xl', 'FLAN_T5_XXL': 'google/flan-t5-xxl', 'FLAN_UL2': 'google/flan-ul2', 'GRANITE_13B_CHAT_V2': 'ibm/granite-13b-chat-v2', 'GRANITE_13B_INSTRUCT_V2': 'ibm/granite-13b-instruct-v2', 'GRANITE_20B_CODE_INSTRUCT': 'ibm/granite-20b-code-instruct', 'GRANITE_20B_MULTILINGUAL': 'ibm/granite-20b-multilingual', 'GRANITE_3_2B_INSTRUCT': 'ibm/granite-3-2b-instruct', 'GRANITE_3_8B_INSTRUCT': 'ibm/granite-3-8b-instruct', 'GRANITE_34B_CODE_INSTRUCT': 'ibm/granite-34b-code-instruct', 'GRANITE_3B_CODE_INSTRUCT': 'ibm/granite-3b-code-instruct', 'GRANITE_7B_LAB': 'ibm/granite-7b-lab', 'GRANITE_8B_CODE_INSTRUCT': 'ibm/granite-8b-code-instruct', 'GRANITE_8B_JAPANESE': 'ibm/granite-8b-japanese', 'GRANITE_GUARDIAN_3_2B': 'ibm/granite-guardian-3-2b', 'GRANITE_GUARDIAN_3_8B': 'ibm/granite-guardian-3-8b', 'LLAMA_2_13B_CHAT': 'meta-llama/llama-2-13b-chat', 'LLAMA_3_1_70B_INSTRUCT': 'meta-llama/llama-3-1-70b-instruct', 'LLAMA_3_1_8B_INSTRUCT': 'meta-llama/llama-3-1-8b-instruct', 'LLAMA_3_2_11B_VISION_INSTRUCT': 'meta-llama/llama-3-2-11b-vision-instruct', 'LLAMA_3_2_1B_INSTRUCT': 'meta-llama/llama-3-2-1b-instruct', 'LLAMA_3_2_3B_INSTRUCT': 'meta-llama/llama-3-2-3b-instruct', 'LLAMA_3_2_90B_VISION_INSTRUCT': 'meta-llama/llama-3-2-90b-vision-instruct', 'LLAMA_3_405B_INSTRUCT': 'meta-llama/llama-3-405b-instruct', 'LLAMA_3_70B_INSTRUCT': 'meta-llama/llama-3-70b-instruct', 'LLAMA_3_8B_INSTRUCT': 'meta-llama/llama-3-8b-instruct', 'LLAMA_GUARD_3_11B_VISION': 'meta-llama/llama-guard-3-11b-vision', 'MISTRAL_LARGE': 'mistralai/mistral-large', 'MIXTRAL_8X7B_INSTRUCT_V01': 'mistralai/mixtral-8x7b-instruct-v01', 'PIXTRAL_12B': 'mistralai/pixtral-12b', 'LLAMA2_13B_DPO_V7': 'mncai/llama2-13b-dpo-v7', 'ALLAM_1_13B_INSTRUCT': 'sdaia/allam-1-13b-instruct'}

Specify the model IDs that will be used for inferencing:

In [9]:

model_id_1 = "google/flan-ul2"
model_id_2 = "google/flan-t5-xxl"

Defining the model parameters¶

You might need to adjust model parameters for different models or tasks. To do so, refer to the documentation of the GenTextParamsMetaNames class.

Action: If you run into any complications, refer to the documentation.

In [10]:

from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods

parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.SAMPLE.value,
    GenParams.MAX_NEW_TOKENS: 100,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.TEMPERATURE: 0.5,
    GenParams.TOP_K: 50,
    GenParams.TOP_P: 1
}

LangChain integration¶

WatsonxLLM is a wrapper around watsonx.ai models that provides chain integration around the models.

Action: For more details about CustomLLM, check the LangChain documentation.

Initialize the `WatsonxLLM` class.¶

In [11]:

from langchain_ibm import WatsonxLLM

flan_ul2_llm = WatsonxLLM(
    model_id=model_id_1,
    url=credentials["url"],
    apikey=credentials["apikey"],
    project_id=project_id,
    params=parameters
    )
flan_t5_llm = WatsonxLLM(
    model_id=model_id_2,
    url=credentials["url"],
    apikey=credentials["apikey"],
    project_id=project_id
    )

You can print all set data about the WatsonxLLM object using the dict() method.

In [12]:

flan_ul2_llm.dict()

Out[12]:

{'model_id': 'google/flan-ul2',
 'deployment_id': None,
 'params': {'decoding_method': 'sample',
  'max_new_tokens': 100,
  'min_new_tokens': 1,
  'temperature': 0.5,
  'top_k': 50,
  'top_p': 1},
 'project_id': 'd2436d2e-5814-4370-bd06-1754670e7d46',
 'space_id': None,
 '_type': 'IBM watsonx.ai'}
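The output above shows that the `GenParams` constants resolve to plain string keys. As a minimal sketch, the snippet below rebuilds the same dictionary with those keys and derives a greedy-decoding variant from it. The helper `to_greedy` and the constant `SAMPLING_ONLY_KEYS` are hypothetical names introduced for illustration, and the choice to drop `temperature`, `top_k`, and `top_p` assumes they are only relevant when sampling.

```python
# Plain-dict equivalent of the `parameters` cell, using the string keys
# that appear in the dict() output.
sample_params = {
    "decoding_method": "sample",
    "max_new_tokens": 100,
    "min_new_tokens": 1,
    "temperature": 0.5,
    "top_k": 50,
    "top_p": 1,
}

# Assumption: these keys only matter when sampling, so the greedy variant drops them.
SAMPLING_ONLY_KEYS = ("temperature", "top_k", "top_p")


def to_greedy(params: dict) -> dict:
    """Return a copy of params switched to greedy decoding (hypothetical helper)."""
    greedy = {k: v for k, v in params.items() if k not in SAMPLING_ONLY_KEYS}
    greedy["decoding_method"] = "greedy"
    return greedy


greedy_params = to_greedy(sample_params)
print(greedy_params)
```

A dictionary like `greedy_params` could be passed as `params` in the same way as `parameters`, which may help when more repeatable output is wanted.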

Sequential Chain experiment¶

A sequential chain runs its steps in order, and the output of one step serves as input for the following steps. This notebook uses SequentialChain, the general form that supports multiple named inputs and outputs.

The experiment consists of generating a random question about a given topic and then answering that question.

An object called PromptTemplate assists in generating prompts using a combination of user input, additional non-static data, and a fixed template string.

In our case we would like to create two PromptTemplate objects which will be responsible for creating a random question and answering it.

In [13]:

from langchain_core.prompts import PromptTemplate

prompt_1 = PromptTemplate(
    input_variables=["topic"], 
    template="Generate a random question about {topic}: Question: "
)
prompt_2 = PromptTemplate(
    input_variables=["question"],
    template="Answer the following question: {question}",
)
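To make the templating step concrete, the sketch below fills the same two template strings with plain `str.format`. This is only a stand-in for what `PromptTemplate` does with its placeholders and does not call LangChain. The sample question is made up for illustration; in the notebook it comes from the first model.

```python
# Stand-in for PromptTemplate: the same template strings, filled with str.format.
template_1 = "Generate a random question about {topic}: Question: "
template_2 = "Answer the following question: {question}"

# The first prompt only needs the user-supplied topic.
rendered_1 = template_1.format(topic="life")
print(rendered_1)

# The second prompt needs a question. This one is a made-up example.
rendered_2 = template_2.format(question="What is the smallest unit of life?")
print(rendered_2)
```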

LLMChain adds functionality around a language model by combining it with a prompt template.

The prompt_to_flan_ul2 chain formats the prompt template whose task is to generate a random question, passes the formatted string to the LLM, and returns the LLM output.

In [ ]:

from langchain.chains import LLMChain

prompt_to_flan_ul2 = LLMChain(llm=flan_ul2_llm, prompt=prompt_1, output_key='question')

The flan_to_t5 chain formats the prompt template whose task is to answer the question produced by the prompt_to_flan_ul2 chain, passes the formatted string to the LLM, and returns the LLM output.

In [15]:

flan_to_t5 = LLMChain(llm=flan_t5_llm, prompt=prompt_2, output_key='answer')

This is the overall chain, which runs the prompt_to_flan_ul2 and flan_to_t5 chains in sequence.

In [16]:

from langchain.chains import SequentialChain

qa = SequentialChain(chains=[prompt_to_flan_ul2, flan_to_t5], input_variables=["topic"], output_variables=['question', 'answer'], verbose=True)

Generate a random question about the topic and an answer to it.

In [17]:

qa.invoke({"topic": "life"})


> Entering new SequentialChain chain...

> Finished chain.

Out[17]:

{'topic': 'life',
 'question': 'What is the name of the largest zoo in the world?',
 'answer': 'zoo sao paulo brazil'}
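The data flow behind this result can be sketched in plain Python. The snippet below is not LangChain code: `run_sequence` and the two fake models are hypothetical stand-ins that return canned text instead of calling watsonx.ai. It only illustrates how each step's output is stored under its output key and made available to later steps.

```python
from typing import Callable, Dict, List, Tuple

# Each step is (prompt template, model callable, output key).
Step = Tuple[str, Callable[[str], str], str]


def fake_question_model(prompt: str) -> str:
    """Hypothetical stand-in for the question-generating model."""
    return "What is the smallest unit of life?"


def fake_answer_model(prompt: str) -> str:
    """Hypothetical stand-in for the answering model."""
    return "cell"


def run_sequence(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    """Run steps in order, storing each output under its key for later steps."""
    state = dict(inputs)
    for template, model, output_key in steps:
        prompt = template.format(**state)  # unused keys in state are ignored
        state[output_key] = model(prompt)
    return state


steps = [
    ("Generate a random question about {topic}: Question: ", fake_question_model, "question"),
    ("Answer the following question: {question}", fake_answer_model, "answer"),
]

result = run_sequence(steps, {"topic": "life"})
print(result)
```

The returned dictionary has the same three keys as the chain output above: the original input plus one entry per step.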

Python function¶

Let's wrap the chain code in a Python function that can serve as the content of an inference endpoint.

Function implementation.¶

Wrap the above chain code into a function.

In [18]:

ai_params = {
    "url": credentials["url"],
    "apikey": credentials["apikey"],
    "project_id": project_id,
    "generation_parameters": parameters
}

In [19]:

def chain_text_generator(params=ai_params):
    
    from ibm_watsonx_ai.foundation_models.utils.enums import ModelTypes
    from langchain_core.prompts import PromptTemplate
    from langchain.chains import LLMChain, SequentialChain
    from langchain_ibm import WatsonxLLM

    url = params["url"]
    apikey = params["apikey"]
    project_id = params['project_id']
    parameters = params['generation_parameters']
    flan_ul2_llm = WatsonxLLM(model_id=ModelTypes.FLAN_UL2.value, url=url, apikey=apikey, project_id=project_id, params=parameters)
    flan_t5_llm = WatsonxLLM(model_id=ModelTypes.FLAN_T5_XXL.value, url=url, apikey=apikey, project_id=project_id)
    prompt_1 = PromptTemplate(input_variables=["topic"], template="Generate a random question about {topic}: Question: ")
    prompt_2 = PromptTemplate(input_variables=["question"], template="Answer the following question: {question}")
    prompt_to_flan_ul2 = LLMChain(llm=flan_ul2_llm, prompt=prompt_1, output_key='question')
    flan_to_t5 = LLMChain(llm=flan_t5_llm, prompt=prompt_2, output_key='answer')
    chain = SequentialChain(chains=[prompt_to_flan_ul2, flan_to_t5], input_variables=["topic"], output_variables=['question', 'answer'])

    def score(payload):
        """Generates question based on provided topic and returns the answer."""

        answer = chain.invoke({"topic": payload["input_data"][0]['values'][0][0]})
        return {'predictions': [{'fields': ['topic', 'question', 'answer'], 'values': [answer['topic'], answer['question'], answer['answer']]}]}

    return score

Test the function¶

It is good practice to validate the code locally first.

In [20]:

sample_payload = {
    "input_data": [
        {
            "fields": ["topic"],
            "values": [["life"]]
        }
    ]
}

inference = chain_text_generator()
inference(sample_payload)

Out[20]:

{'predictions': [{'fields': ['topic', 'question', 'answer'],
   'values': ['life',
    'What is the term for the process of a living thing gaining nutrients from dead organisms?',
    'decomposition']}]}
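The `score` function reads the topic with `payload["input_data"][0]['values'][0][0]`, which raises a bare `KeyError` or `IndexError` if the payload has a different shape. As a minimal sketch, a helper such as the hypothetical `extract_topic` below could validate the payload first and report a clearer error. It is an illustration only and is not part of the deployed function.

```python
def extract_topic(payload: dict) -> str:
    """Return the topic from a payload shaped like sample_payload."""
    try:
        # First entry of input_data, first row of values, first column.
        topic = payload["input_data"][0]["values"][0][0]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(
            "expected {'input_data': [{'fields': ['topic'], 'values': [[<topic>]]}]}"
        ) from exc
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("topic must be a non-empty string")
    return topic.strip()


example_payload = {"input_data": [{"fields": ["topic"], "values": [["life"]]}]}
print(extract_topic(example_payload))
```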

Custom inference endpoint¶

Create an online deployment of the Python function.

Custom software specification¶

Create a new software specification based on the default Python 3.11 environment, extended with the langchain and langchain_ibm packages.

In [19]:

config_yml =\
"""
name: python311
channels:
  - empty
dependencies:
  - pip:
    - langchain_ibm==0.3.3
    - langchain==0.3.7
prefix: /opt/anaconda3/envs/python311
"""

with open("config.yaml", "w", encoding="utf-8") as f:
    f.write(config_yml)

In [20]:

space_id = 'PASTE YOUR SPACE ID HERE'

Now store the new package extension with APIClient.

In [21]:

from ibm_watsonx_ai import APIClient

client = APIClient(credentials)
client.set.default_space(space_id)
base_sw_spec_id = client.software_specifications.get_id_by_name("runtime-24.1-py3.11")
meta_prop_pkg_extn = {
    client.package_extensions.ConfigurationMetaNames.NAME: "langchain watsonx.ai env",
    client.package_extensions.ConfigurationMetaNames.DESCRIPTION: "Environment with langchain",
    client.package_extensions.ConfigurationMetaNames.TYPE: "conda_yml"
}

pkg_extn_details = client.package_extensions.store(meta_props=meta_prop_pkg_extn, file_path="config.yaml")
pkg_extn_id = client.package_extensions.get_id(pkg_extn_details)
pkg_extn_url = client.package_extensions.get_href(pkg_extn_details)

Creating package extensions
SUCCESS

Create a new software specification and add the created package extension to it.

In [22]:

meta_prop_sw_spec = {
    client.software_specifications.ConfigurationMetaNames.NAME: "langchain watsonx.ai custom software specification",
    client.software_specifications.ConfigurationMetaNames.DESCRIPTION: "Software specification for langchain",
    client.software_specifications.ConfigurationMetaNames.BASE_SOFTWARE_SPECIFICATION: {"guid": base_sw_spec_id}
}

sw_spec_details = client.software_specifications.store(meta_props=meta_prop_sw_spec)
sw_spec_id = client.software_specifications.get_id(sw_spec_details)
client.software_specifications.add_package_extension(sw_spec_id, pkg_extn_id)

SUCCESS

Out[22]:

'SUCCESS'

Store the function¶

In [23]:

meta_props = {
    client.repository.FunctionMetaNames.NAME: "SequenceChain LLM function",
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_ID: sw_spec_id
}

function_details = client.repository.store_function(meta_props=meta_props, function=chain_text_generator)
function_id = client.repository.get_function_id(function_details)

Create online deployment¶

In [24]:

metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Deployment of LLMs chain function",
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}

function_deployment = client.deployments.create(function_id, meta_props=metadata)


######################################################################################

Synchronous deployment creation for id: 'd71113c7-bcc7-4adf-ba23-fe8ea3c120c6' started

######################################################################################


initializing
Note: online_url and serving_urls are deprecated and will be removed in a future release. Use inference instead.
...........
ready


-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='8d7ec29e-51d5-40ae-813b-3d37102ded54'
-----------------------------------------------------------------------------------------------

Scoring¶

Generate text using the custom inference endpoint.

In [25]:

deployment_id = client.deployments.get_id(function_deployment)
client.deployments.score(deployment_id, sample_payload)

Out[25]:

{'predictions': [{'fields': ['topic', 'question', 'answer'],
   'values': ['life',
    'What is the name of the smallest unit of life?',
    'cell']}]}
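The response pairs a `fields` list with a `values` list in the same order. A short sketch of how such a response could be turned into dictionaries keyed by field name is shown below. `predictions_to_records` is a hypothetical helper, and `example_response` copies the output above so that the snippet runs without a deployment.

```python
def predictions_to_records(response: dict) -> list:
    """Turn each prediction into a dict that maps field names to values."""
    return [
        dict(zip(prediction["fields"], prediction["values"]))
        for prediction in response["predictions"]
    ]


# Copy of the scoring output shown above.
example_response = {
    "predictions": [
        {
            "fields": ["topic", "question", "answer"],
            "values": [
                "life",
                "What is the name of the smallest unit of life?",
                "cell",
            ],
        }
    ]
}

records = predictions_to_records(example_response)
print(records[0]["question"])
print(records[0]["answer"])
```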

Summary and next steps¶

You successfully completed this notebook!

You learned how to run a SequentialChain using the custom LLM WatsonxLLM.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors:¶

Lukasz Cmielowski, PhD, Senior Technical Staff Member at Watson Machine Learning.

Mateusz Szewczyk, Software Engineer at Watson Machine Learning.